Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make flake in configMap update e2e easier to debug #21573

Merged
merged 1 commit into from
Feb 19, 2016

Conversation

pmorie
Copy link
Member

@pmorie pmorie commented Feb 19, 2016

Help with #21244

@ihmccreery @ixdy

@k8s-github-robot
Copy link

Labelling this PR as size/XS

@k8s-github-robot k8s-github-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Feb 19, 2016
@saad-ali
Copy link
Member

LGTM

@saad-ali saad-ali added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 19, 2016
@@ -181,7 +181,7 @@ var _ = Describe("ConfigMap", func() {
// Kubelet projects the update into the volume and the container picks
// it up. This timeout is based on the default Kubelet sync period (1
// minute) plus additional time for fudge factor.
const podLogTimeout = 90 * time.Second
const podLogTimeout = 150 * time.Second
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem here is that my timeout in the original test was too optimistic. It may take a full sync interval in the kubelet to pick up the change; bumping way up to avoid future timeouts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guess there's no way to programatically make the sync happen sooner?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to link this directly to the sync interval, so it floats against it?

Also, it should be ≤2*interval, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is an update that needs to be processed by a pod worker, we shouldn't go fine with timeouts. The kubelet needs to wake up and invoke the worker. You can probably engineer a timeout that matches pretty fine with when a pod worker is woken up. But if a worker is already doing work, or many workers are up and competing for cpu, there's very little you can do to guarantee it does work exactly on some edge of the kubelet sync.

Employ the same strategy as we do for pod readiness? (poll till 5m)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Polling for 5 min sounds legit -- does that mean this should move to [Slow]?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are too many tests that poll for 5m in gce normal. I think we can catch if this just keeps crossing 5m via the test dashboard, I don't think any of the tests that currently poll for 5m actually do. If one of the generic timesout in e2e/util are applicable in this case, just use that and we'll bump all at the same time if we do?

@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e test build/test passed for commit b53584c0fab3b4e0445545bb7c827a8b1b25d75b.

@k8s-github-robot
Copy link

PR changed after LGTM, removing LGTM.

@k8s-github-robot k8s-github-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Feb 19, 2016
@k8s-github-robot
Copy link

Labelling this PR as size/S

@pmorie
Copy link
Member Author

pmorie commented Feb 19, 2016

@bprashanth @ihmccreery bumped timeout to 5m per @bprashanth's idea.

@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e test build/test passed for commit b311c9e95ea6de8a10be8532d48d626a08965df3.

@bprashanth
Copy link
Contributor

If this consistenly takes 5m it's a bug we should address. Otherwise I'm fine with another bug saying "investigate config map slowness" etc (maybe filed on yourself? ;) and bumping it. It will validate the theory that this is in fact a sluggish podworker, and we can catch it if it keeps exceeding 5m.

@bprashanth bprashanth added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 19, 2016
@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e build/test failed for commit f8d58ac.

@pmorie
Copy link
Member Author

pmorie commented Feb 19, 2016

@k8s-bot test this issue: #21463

@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e build/test failed for commit f8d58ac.

@pmorie
Copy link
Member Author

pmorie commented Feb 19, 2016

@k8s-bot test this issue: #21463

@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e test build/test passed for commit f8d58ac.

@bprashanth
Copy link
Contributor

@k8s-oncall trap is green so I'm hand merging. This is clarity and timeout bump.

bprashanth added a commit that referenced this pull request Feb 19, 2016
Make flake in configMap update e2e easier to debug
@bprashanth bprashanth merged commit 6a1156d into kubernetes:master Feb 19, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants